Chapter 21

Summarizing and Graphing Survival Data

IN THIS CHAPTER

Beginning with the basics of survival data

Generating life tables and trying the Kaplan-Meier method

Applying some handy guidelines for survival analysis

Using survival data for even more calculations

This chapter describes statistical techniques that deal with a special kind of numerical data called

survival data or time-to-event data. These data reflect the interval from a particular starting point in

time, such as the date a patient receives a certain diagnosis or undergoes a certain procedure, to the first

or only occurrence of a particular kind of event that represents an endpoint. Because these techniques

are often applied to situations where the endpoint event is death, we usually call the use of these

techniques survival analysis, even when the endpoint is something less drastic (or final) than death.

The endpoint can be undesirable, such as the relapse of a chronic illness symptom after its resolution, but it can

also be desirable, such as remission of cancer or recovery from an acute

condition. Throughout this chapter, we use terms and examples that imply that the endpoint is death,

such as saying survival time instead of time to event. However, everything we say also applies to

other kinds of endpoints.

You may wonder why you need a special kind of analysis for survival data in the first place. Why not

just treat survival times as ordinary numerical variables? Why not summarize them as means, medians,

standard deviations, and so on, and graph them as histograms and box-and-whiskers charts? Why not

compare survival times between groups with t tests and ANOVAs? Why not use ordinary least-squares

regression to explore how various factors influence survival time?

In this chapter, we explain how survival data aren’t like ordinary numerical data and why you need to

use specific techniques to analyze them properly. We describe two ways to construct survival curves:

the life-table and the Kaplan-Meier methods. We guide you in preparing and interpreting survival

curves and show you how to glean useful information from these curves, such as median survival time

and five-year survival rates.

Understanding the Basics of Survival Data

To understand survival analysis, you first have to understand survival data. Survival times are

intervals between a designated starting time point and the time point at which an event occurs. These intervals

can have a specific type of missing data due to a phenomenon called censoring. Because survival

data usually include censored data, they must be analyzed in a very specific way to avoid generating

biased estimates that lead to incorrect conclusions.
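
To make the idea concrete, here is a minimal Python sketch of how survival data might be stored and why an ordinary mean can mislead. The four subjects, the Subject class, and its field names are hypothetical and invented purely for illustration. We assume each subject has a starting day, a last-known day, and a flag saying whether death was actually observed; a subject whose death was not observed is treated as censored.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Subject:
    """One hypothetical study subject (illustrative data, not from a real study)."""
    start_day: int   # day of diagnosis, counted from the start of the study
    last_day: int    # day of death, or day of last contact if death was not observed
    died: bool       # True if death was observed; False means the subject is censored

    @property
    def observed_time(self) -> int:
        # The interval we actually observed, in days
        return self.last_day - self.start_day


subjects = [
    Subject(start_day=0, last_day=120, died=True),
    Subject(start_day=30, last_day=400, died=False),  # censored
    Subject(start_day=10, last_day=250, died=True),
    Subject(start_day=60, last_day=365, died=False),  # censored
]

observed = [s.observed_time for s in subjects]
n_censored = sum(not s.died for s in subjects)

# Averaging the observed intervals treats each censored interval as if it were
# a complete survival time. A censored subject survived at least that long,
# so this average can only understate the true mean survival time.
naive_mean = mean(observed)

print("Observed intervals (days):", observed)
print("Censored subjects:", n_censored)
print("Naive mean (a lower bound, not an estimate):", naive_mean)
```

In this made-up example, two of the four intervals are only lower bounds on the true survival time, so the naive mean is too small by an unknown amount. Simply discarding the censored subjects doesn't solve the problem either, because it throws away the information that those subjects survived at least as long as they were observed.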

Examining how survival times are intervals